Abstract: This work considers the problem of quickest detection
of signals in a coupled system of N sensors, which receive
continuous sequential observations from the environment. It
is assumed that the signals, which are modeled as general Ito
processes, are coupled across sensors, but that their onset times 
may differ from sensor to sensor. The objective is the optimal 
detection of the first time at which any sensor in the system 
receives a signal. The problem is formulated as a stochastic 
optimization problem in which an extended average Kullback- 
Leibler divergence criterion is used as a measure of detection 
delay, with a constraint on the mean time between false alarms. 
The case in which the sensors employ cumulative sum (CUSUM) 
strategies is considered, and it is proved that the minimum of N 
CUSUMs is asymptotically optimal as the mean time between 
false alarms increases without bound. 

Keywords: Kullback-Leibler divergence, CUSUM, quick- 
est detection 

I. INTRODUCTION 

We are interested in the problem of quickest detection 
of the onset of a signal in a system of N sensors. We 
consider the situation in which, although the observations 
in one sensor can affect the observations in another, the 
onset of a signal can occur at different times (i.e., change 
points) in each of the N sensors; that is, the change points 
differ from sensor to sensor. As an example in which this
situation arises, consider a system of sensors monitoring the
health of a physical structure in which fault conditions are 
manifested by vibrations in the structure. Before a change 
affects a given sensor, we have only noise in that sensor. 
Then, after a change, the system is vibrating and thus the 
signal received in any location reflects a vibrating system. 
Thus, observations at any given sensor are coupled with 
those received in other locations. The change points observed 
at different sensors can occur at different times because 
the source of the vibrations (i.e., the excitation) may arrive 
at different structural elements at different times. Relevant 
literature related to such models includes, for example, [1]- 
[3], [5], [7], [8], [15]. 

We assume that the probability law of the observations is 
the same across sensors. This assumption, although seemingly
restrictive, is realistic in view of the fact that the system of
sensors is coupled. We model the signals through continuous- 
time Ito processes. The advantage of such models is the 
fact that they can capture complex dependencies in the 
observations. For example, an autoregressive process is a 
special case of the discrete-time equivalent of an Ornstein- 
Uhlenbeck process, which in turn, is a special case of an Ito 
process. Other special cases of this model include Markovian 
models, and linear state-space systems commonly used in 
vibration-based structural analysis and health monitoring 
problems [1]-[3], [5], [7], [8], [15]. It is important to stress
that the fact that the system of N sensors is coupled makes 
the probabilistic treatment of the problem equivalent to 
the one in which all observations become available in one 
location. The reason is that one integrated information flow 
is sufficient for describing such a system. 
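
The link between autoregressive and Ornstein-Uhlenbeck models mentioned above can be illustrated with a small numerical sketch. It assumes a scalar Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW sampled every `delta` time units; the function name `ou_to_ar1` and the parameters `theta`, `sigma` and `delta` are illustrative choices and are not part of the model studied in this paper.

```python
import math

def ou_to_ar1(theta, sigma, delta):
    """Exact sampling of dX = -theta*X dt + sigma dW at step `delta`.

    Returns (phi, q) such that X[k+1] = phi * X[k] + e[k], where the e[k]
    are independent N(0, q). This is an AR(1) recursion.
    """
    phi = math.exp(-theta * delta)
    q = sigma ** 2 * (1.0 - math.exp(-2.0 * theta * delta)) / (2.0 * theta)
    return phi, q

phi, q = ou_to_ar1(theta=0.5, sigma=1.0, delta=0.1)
# The AR(1) stationary variance q / (1 - phi^2) coincides with the
# Ornstein-Uhlenbeck stationary variance sigma^2 / (2 * theta).
stationary_var = q / (1.0 - phi ** 2)
print(phi, q, stationary_var)
```

The autoregressive coefficient `phi` lies strictly between 0 and 1 whenever `theta` and `delta` are positive, so the sampled sequence is a stable AR(1) process.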

Our objective is to detect the first onset of a signal in such 
a system. So far, the literature on this type of problem
(see [12], [19]-[22]) has assumed that the change
points are the same across sensors. More recently, the case of
change points that propagate in a sensor array has also been
considered [17]. However, in that configuration the propagation
of the change points depends on the unknown identity of
the first sensor affected, and the mechanism of propagation
of the change is restricted to be Markovian.

In this paper we consider the case in which the change 
points can be different and do not propagate in any specific 
configuration. The objective is to detect the minimum (i.e., 
the first) of the change points. We demonstrate that, in the
situation described above, the minimum of N CUSUMs is
asymptotically optimal in detecting the minimum of the N
different change points, as the mean time between false
alarms tends to infinity, with respect to an appropriately
extended Kullback-Leibler divergence criterion [11] that
incorporates the possibility of N different change points.

In the next section we formulate the problem, discuss 
special cases of our Ito models and demonstrate asymptotic 
optimality (as the mean time between false alarms tends to
infinity), in an extended min-max Kullback-Leibler sense, of the
minimum of N CUSUM stopping times. We finally discuss 
extensions of these results to the case of different structures 
of observations in each sensor. 

II. FORMULATIONS & RESULTS 

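We sequentially observe the processes $\{Z_t^{(i)}; t \ge 0\}$ for
all $i = 1,\dots,N$. In order to formalize this problem we
consider the measurable space $(\Omega, \mathcal{F})$, where
$\Omega = C[0,\infty)^N$ and $\mathcal{F} = \cup_{t>0}\mathcal{F}_t$ with
$\mathcal{F}_t = \sigma\{(Z_s^{(1)},\dots,Z_s^{(N)}); s \le t\}$.

The processes $\{Z_t^{(i)}; t \ge 0\}$ for all $i = 1,\dots,N$ are
assumed to have the following dynamics:

(1)   $dZ_t^{(i)} = \begin{cases} dw_t^{(i)}, & t \le \tau_i, \\ \alpha_t^{(i)}\,dt + dw_t^{(i)}, & t > \tau_i, \end{cases}$
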

where $\{\alpha_t^{(i)}; t \ge 0\}$ is a process on the same probability
space adapted to the filtration $\{\mathcal{F}_t\}$ and $\{w_t^{(i)}; t \ge 0\}$ are
independent standard Brownian motions. The case considered
in this paper is that in which $\alpha_t^{(i)}$ is the same for all i. This
can be described as a signal symmetry across sensors.

We notice that $\{\mathcal{F}_t\}$ is the filtration generated by the
observations received by all sensors. Thus, by requiring that
$\alpha_t^{(i)}$ be $\mathcal{F}_t$-measurable for all i, we have managed to capture
the coupled nature of the system. In particular, in the special
case in which, say, $\alpha_t^{(1)} = \sum_{i=1}^{N} \beta_i Z_t^{(i)}$, (1) describes a
process which displays an autoregressive (or its continuous
equivalent [13]) behavior in $\{Z_t^{(1)}; t \ge 0\}$, while still being
coupled with the observations received by the other sensors.
More specifically, the magnitude of each increment of the
process $\{Z_t^{(1)}; t \ge 0\}$ at each instant t is affected not only
by $Z_t^{(1)}$ but also by $Z_t^{(i)}$, $i = 2,\dots,N$, the observations
at sensors $2,\dots,N$. This couples the observations received
in sensor 1 with those received in sensors $2,\dots,N$ at each
instant t and results in a system of interdependent sensors.
We notice that the special case described above can also be
written in the form of a linear state-space model as follows:
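
$\begin{pmatrix} dZ_t^{(1)} \\ \vdots \\ dZ_t^{(N)} \end{pmatrix} = B \begin{pmatrix} Z_t^{(1)} \\ \vdots \\ Z_t^{(N)} \end{pmatrix} dt + \begin{pmatrix} dW_t^{(1)} \\ \vdots \\ dW_t^{(N)} \end{pmatrix},$

where B denotes the $N \times N$ matrix of drift coefficients.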

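Autoregressive models and, more generally, linear state space
models have been used to capture seismic signals, navigation
systems, vibrating mechanical systems, etc. (see, e.g., [4]).
Another special case of (1) is

$\begin{pmatrix} dZ_t^{(1)} \\ dZ_t^{(2)} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} Z_t^{(1)} \\ Z_t^{(2)} \end{pmatrix} dt + \begin{pmatrix} dW_t^{(1)} \\ dW_t^{(2)} \end{pmatrix},$
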

a model that describes sinusoidal waves driven by noise.
Such a model could also be used to capture vibrating
mechanical systems. The generality of (1), however, is much
greater than the special cases described above. This is seen
in the fact that $\alpha_t^{(i)}$ at each instant t can depend on the
totality of the observed paths of each of the signals received
up to time t.

On the space $\Omega$, we have the following family of probability
measures $\{P_{\tau_1,\dots,\tau_N}\}$, where $P_{\tau_1,\dots,\tau_N}$ corresponds to the
measure generated on $\Omega$ by the processes $(Z_t^{(1)},\dots,Z_t^{(N)})$
when the change in the N-tuple process occurs at time points
$\tau_i$, $i = 1,\dots,N$. Notice that the measure $P_{\infty,\dots,\infty}$
corresponds to the measure generated on $\Omega$ by N independent
standard Brownian motions.



Our objective is to find a stopping rule T that keeps the
detection delay small subject to a lower bound on the mean
time between false alarms, and that will ultimately detect
$\min\{\tau_1,\dots,\tau_N\}$. In what follows we will
use $\tilde{\tau}$ to denote $\min\{\tau_1,\dots,\tau_N\}$.

To this effect we propose a generalization of the $J_{KL}$ of
[11], namely

(2)   $J_{KL}^{(N)}(T) = \sup_{\tau_1,\dots,\tau_N} \operatorname{essup}\, E_{\tau_1,\dots,\tau_N}\Big\{ 1_{\{T > \tilde{\tau}\}} \int_{\tilde{\tau}}^{T} \frac{1}{N} \sum_{i=1}^{N} \frac{1}{2} (\alpha_s^{(i)})^2\,ds \,\Big|\, \mathcal{F}_{\tilde{\tau}} \Big\},$

where the supremum over $\tau_1,\dots,\tau_N$ is taken over the set
in which $\min\{\tau_1,\dots,\tau_N\} < \infty$. That is, we consider the
worst detection delay over all possible realizations of paths
of the N-tuple of stochastic processes $(Z_t^{(1)},\dots,Z_t^{(N)})$ up
to $\min\{\tau_1,\dots,\tau_N\}$ and then consider the worst detection
delay over all possible N-tuples $\{\tau_1,\dots,\tau_N\}$ over a set in
which at least one of them is forced to take a finite value.
This is because T is a stopping rule meant to detect the
minimum of the N change points and therefore, if one of the
N processes undergoes a regime change, any unit of time by
which T delays in reacting should be counted towards the
detection delay. This gives rise to the following stochastic
optimization problem:



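(3)   $\inf_T J_{KL}^{(N)}(T)$, subject to $E_{\infty,\dots,\infty}\Big\{ \int_0^T \frac{1}{N} \sum_{i=1}^{N} \frac{1}{2} (\alpha_s^{(i)})^2\,ds \Big\} \ge \gamma.$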



The criterion in (2) can be similarly motivated by considering
the average over all sensors of the Kullback-Leibler
divergence:



(4) 



= E T 



1 dP 

_L iog ^ T1 — TW 

N dP 



N ^-i 

i=l 



i(a«) 2 dr|J> 



where the last equality follows as long as 



(5) E T 



{a^fdr 



J-f\<oo a.s. 



for all i = 1, . . . , N and all t < oo. 

Using an argument similar to the randomization argument 
of [6], it is also possible to show that the optimal stopping 
rule T* must be an equalizer rule in that it would react at 
exactly the same time regardless of which change takes place 
first. In order to demonstrate this fact we begin by noting that 
minimization of (O is equivalent to minimizing 



sup essup E Tl . 

Ti,...,T N 

(6) 



Now define 



jr>(T) 



sup essupS Tli ... iTjv 



(T)} 



for $i = 1,\dots,N$. That is, $J_i^{(N)}(T)$ is the detection delay of
the stopping rule T when $\tau_i \le \min_{j \ne i}\{\tau_j\}$. Then

(7)   $J_{KL}^{(N)}(T) = \max\{J_1^{(N)}(T),\dots,J_N^{(N)}(T)\}.$

The optimal solution to (3), $T^*$, satisfies

(8)   $J_1^{(N)}(T^*) = J_2^{(N)}(T^*) = \dots = J_N^{(N)}(T^*).$

To see this, let us consider the case when N = 2. Let T
be a stopping rule such that $J_1^{(2)}(T) < J_2^{(2)}(T)$. Consider
another stopping rule S, which stops as T does, but observes
$Z_t^{(2)}$ in place of $Z_t^{(1)}$ and $Z_t^{(1)}$ in place of $Z_t^{(2)}$. It follows
that

$J_1^{(2)}(S) = J_2^{(2)}(T)$ and $J_2^{(2)}(S) = J_1^{(2)}(T).$

We trivially also have that

$E_{\infty,\infty}\{S\} = E_{\infty,\infty}\{T\}.$

Now let us use a binary random variable $X \in \{0, 1\}$, which
is independent of $\{\mathcal{F}_t\}$, to construct a randomized stopping
rule adapted to $\bar{\mathcal{F}}_t = \mathcal{F}_t \vee \sigma(X)$,

(9)   $\tilde{T} = XT + (1 - X)S.$

It is easy to observe that

$J_1^{(2)}(\tilde{T}) = \frac{1}{2}\big[J_1^{(2)}(T) + J_2^{(2)}(T)\big] < J_2^{(2)}(T)$

and

$J_1^{(2)}(\tilde{T}) = J_2^{(2)}(\tilde{T}),$

which implies

$J_{KL}^{(2)}(\tilde{T}) < J_{KL}^{(2)}(T)$

by (7). Therefore the optimal solution to (3) must satisfy (8).^1

^1 Although $\tilde{T}$ of equation (9) is measurable with respect to the enlarged
filtration $\{\bar{\mathcal{F}}_t\}$, the optimal solution to (3) must be adapted to the original
filtration $\{\mathcal{F}_t\}$.

For a fixed i, and the dynamics of (1), the CUSUM
stopping rule is

(10)   $T_\nu = \inf\{t \ge 0;\ y_t^{(i)} \ge \nu\},$

where

(11)   $y_t^{(i)} = u_t^{(i)} - m_t^{(i)}, \quad i = 1,\dots,N,$

with $m_t^{(i)} = \inf_{s \le t} u_s^{(i)}$, $i = 1,\dots,N$, and

(12)   $u_t^{(i)} = \int_0^t \alpha_s^{(i)}\,dZ_s^{(i)} - \frac{1}{2}\int_0^t (\alpha_s^{(i)})^2\,ds.$

In the case that N = 1, in which the drift denoted by $\alpha_t$ is
measurable with respect to the filtration generated by only
one process, say $\{Z_t; t \ge 0\}$, the CUSUM stopping rule
(10) is optimal in minimizing the Kullback-Leibler divergence
criterion of [11] subject to the false alarm constraint
$E_\infty\{\frac{1}{2}\int_0^{T} \alpha_t^2\,dt\} \ge \gamma$. The $\nu$ in (10) is chosen so that
$E_\infty\{\frac{1}{2}\int_0^{T_\nu} \alpha_t^2\,dt\} = f(\nu) = \gamma$, with $f(\nu) = e^{\nu} - \nu - 1$
(see [11]) and

(13)   $J_{KL}(T_\nu) = E_0\Big\{\frac{1}{2}\int_0^{T_\nu} \alpha_t^2\,dt\Big\} = f(-\nu).$


The fact that the worst detection delay is the same as that
incurred in the case in which the change point is exactly 0 is
a consequence of the non-negativity of the CUSUM process,
from which it follows that the worst detection delay occurs
when the CUSUM process at the time of the change is at
0 [11].
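
The threshold selection for the single-process CUSUM rule can be checked numerically. The sketch below solves $f(\nu) = \gamma$ by bisection and evaluates the worst-case delay $f(-\nu)$ of (13); the function names, the bisection tolerance and the value of `gamma` are illustrative assumptions rather than anything prescribed in the paper.

```python
import math

def f(x):
    """f(x) = exp(x) - x - 1, the function appearing in the false alarm
    constraint and in the delay formula (13)."""
    return math.exp(x) - x - 1.0

def cusum_threshold(gamma, tol=1e-12):
    """Solve f(nu) = gamma for nu > 0 by bisection (f is increasing for x > 0)."""
    lo, hi = 0.0, 1.0
    while f(hi) < gamma:       # grow the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma = 1.0e6                  # illustrative false alarm level
nu = cusum_threshold(gamma)
delay = f(-nu)                 # worst-case delay of T_nu, from (13)
print(nu, delay, math.log(gamma) - 1.0)
```

For large `gamma` the computed delay is close to $\log\gamma - 1$, which is the first-order behavior used later in the comparison with the multi-chart rule.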

The CUSUM stopping rule (10) is an optimal solution to the
one-dimensional problem of detecting one change-point in
the one-dimensional equivalent of (3). The details can be
found in [11] and [16]. It is important, however, to point out
that a vital assumption necessary for the optimality of the
CUSUM (10) in [11] is

(14)   $P_0\Big( \int_0^\infty \alpha_s^2\,ds = \infty \Big) = P_\infty\Big( \int_0^\infty \alpha_s^2\,ds = \infty \Big) = 1.$




This assumption ensures the a.s. finiteness of the CUSUM 
stopping time (see [9]), whose physical interpretation is 
that the signal received after the change point has sufficient 
energy. We will thus assume that conditions (14) are satisfied
for all processes $\{\alpha_t^{(i)}\}$.

We remark here that if the N change points were the same
then the problem (3) is equivalent to observing only one
stochastic process which is now N-dimensional. Thus, in
this case, the detection delay and mean time between false 
alarms are given by the formulas in the above paragraph. 

Returning to problem (3), it is easily seen that in seeking
solutions to this problem, we can restrict our attention to 
stopping times that achieve the false alarm constraint with 
equality [10]. The optimality of the CUSUM stopping rule 
in the presence of only one observation process suggests 
that a CUSUM type of stopping rule might display similar 
optimality properties in the case of multiple observation 
processes. In particular, an intuitively appealing rule, when
the detection of $\min\{\tau_1,\dots,\tau_N\}$ is of interest, is
$T_h = T_h^1 \wedge \dots \wedge T_h^N$, where $T_h^i$ is the CUSUM stopping rule for
the process $\{Z_t^{(i)}; t \ge 0\}$ for $i = 1,\dots,N$. That is, we use
what is known as a multi-chart CUSUM stopping time [18],
which can be written as

(15)   $T_h = \inf\{t \ge 0;\ \max\{y_t^{(1)},\dots,y_t^{(N)}\} \ge h\},$

where

$y_t^{(i)} = \sup_{0 \le \tau_i \le t} \log \frac{dP_{\tau_i}}{dP_\infty}\Big|_{\mathcal{F}_t}$

and the $P_{\tau_i}$ are the restrictions of the measure $P_{\tau_1,\dots,\tau_N}$
to $C[0,\infty)$.

It is easy to see that (15) is an equalizer rule. That is, it
satisfies (8). This follows from the assumption that $\{\alpha_t^{(i)}\}$
are the same for all i.
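
A discretized simulation can make the multi-chart CUSUM rule concrete. The sketch below is a simplification and not the coupled model of this paper: it assumes a constant post-change drift `alpha` that is the same in every sensor, independent noise across sensors, and an Euler discretization with step `dt`. The function name and all parameter values are hypothetical.

```python
import math
import random

def multichart_cusum_stop(alpha, taus, h, dt=1e-2, t_max=200.0, seed=0):
    """Return the first grid time at which the largest of the N CUSUM
    statistics reaches h, or None if that does not happen before t_max.

    Assumes dZ = dw before tau_i and dZ = alpha dt + dw after tau_i,
    with a constant drift alpha (a special case chosen for illustration).
    """
    rng = random.Random(seed)
    n = len(taus)
    u = [0.0] * n              # log-likelihood ratio processes u_t^(i)
    m = [0.0] * n              # running minima m_t^(i)
    sqrt_dt = math.sqrt(dt)
    t = 0.0
    while t < t_max:
        for i in range(n):
            drift = alpha if t >= taus[i] else 0.0
            dz = drift * dt + sqrt_dt * rng.gauss(0.0, 1.0)
            u[i] += alpha * dz - 0.5 * alpha * alpha * dt
            m[i] = min(m[i], u[i])
        t += dt
        # y_t^(i) = u_t^(i) - m_t^(i) is non-negative by construction
        if max(u[i] - m[i] for i in range(n)) >= h:
            return t
    return None

# Only the first sensor changes regime, at time 0.5.
stop = multichart_cusum_stop(alpha=1.0, taus=[0.5, math.inf, math.inf], h=3.0)
print(stop)
```

Because every sensor uses the same threshold `h` and the same drift, the rule treats all sensors symmetrically, which is the property behind the equalizer statement above.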

Moreover,

(16)   $J_{KL}^{(N)}(T_h) = E_{0,\infty,\dots,\infty}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt \Big\} = E_{\infty,0,\infty,\dots,\infty}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(2)})^2\,dt \Big\} = \dots = E_{\infty,\dots,\infty,0}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(N)})^2\,dt \Big\}.$

This is because the worst detection delay occurs when at
least one of the N processes does not change regime. Thus,
the worst detection delay will occur when none of the other
processes changes regime and, due to the non-negativity of
the CUSUM process, when the CUSUM process of the
remaining process is exactly at 0.

Notice that the threshold h is used for the multi-chart
CUSUM stopping rule (15) in order to distinguish it from $\nu$,
the threshold used for the one-sided CUSUM stopping rule
(10).

In what follows we will demonstrate asymptotic optimality
of (15) as $\gamma \to \infty$. In view of the constraint
in (3), the assumption that $\{\alpha_t^{(i)}\}$ are the same for all
i and (16), in order to assess the optimality properties
of the multi-chart CUSUM rule (15), we will need
to begin by evaluating $E_{0,\infty,\dots,\infty}\{\frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt\}$ and
$E_{\infty,\dots,\infty}\{\frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt\}$.

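In order to demonstrate asymptotic optimality of (15) we
bound the detection delay $J_{KL}^{(N)}$ of the unknown optimal
stopping rule $T^*$ by

(17)   $E_{0,\infty,\dots,\infty}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt \Big\},$

where h is chosen so that

(18)   $E_{\infty,\dots,\infty}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt \Big\} = \gamma.$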

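It is also obvious that $J_{KL}^{(N)}(T^*)$ is bounded from below by
the detection delay of the one CUSUM when there is only
one observation process, in view of the fact that

(19)   $\sup_{\tau_1,\dots,\tau_N} \operatorname{essup}\, E_{\tau_1,\dots,\tau_N}\Big\{ 1_{\{T > \tilde{\tau}\}} \frac{1}{2}\int_{\tilde{\tau}}^{T} (\alpha_t^{(1)})^2\,dt \,\Big|\, \mathcal{F}_{\tilde{\tau}} \Big\} \ge \sup_{\tau_1} \operatorname{essup}\, E_{\tau_1}\Big\{ 1_{\{T > \tau_1\}} \frac{1}{2}\int_{\tau_1}^{T} \alpha_t^2\,dt \,\Big|\, \mathcal{H}_{\tau_1} \Big\},$

where $\alpha_t$ is measurable w.r.t. the filtration generated by the
1-dimensional process, denoted by $\{\mathcal{H}_t\}$, and is the
projection of $\{\alpha_t^{(1)}\}$ on the filtration $\{\mathcal{H}_t\}$.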

The stopping time that minimizes the right hand side is
the CUSUM stopping rule $T_\nu$ of (10), with $\nu$ chosen so as
to satisfy

(20)   $E_\infty\Big\{ \frac{1}{2}\int_0^{T_\nu} \alpha_t^2\,dt \Big\} = \gamma.$



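We will demonstrate that the difference between the upper
and lower bounds in

(21)   $E_{0,\infty,\dots,\infty}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt \Big\} \ge J_{KL}^{(N)}(T^*) \ge E_0\Big\{ \frac{1}{2}\int_0^{T_\nu} \alpha_t^2\,dt \Big\}$

is bounded by a constant as $\gamma \to \infty$, with h and $\nu$ satisfying
(18) and (20), respectively.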

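Lemma 1: Suppose that $\{\alpha_t^{(i)}\}$ are the same for all i. We
have

(22)   $E_{0,\infty,\dots,\infty}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt \Big\} = [\log\gamma + \log N - 1 + o(1)],$

as $\gamma \to \infty$.

Proof: Please refer to the Appendix for a sketch of the proof.

Moreover, it is easily seen from (13) that

(23)   $E_0\Big\{ \frac{1}{2}\int_0^{T_\nu} \alpha_t^2\,dt \Big\} = [\log\gamma - 1 + o(1)].$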



Thus we have the following result. 

Theorem 1: Suppose that $\{\alpha_t^{(i)}\}$ are the same for all i.
Then the difference between the detection delay $J_{KL}^{(N)}$ of the
unknown optimal stopping rule $T^*$ and the detection delay of
$T_h$ of (15), with h satisfying (18), is bounded above by

$\log N,$

as $\gamma \to \infty$.

Proof: The proof follows from Lemma 1 and (23).

Remark: Since $J_{KL}^{(N)}(T_h)$ increases without bound as
$\gamma \to \infty$, Theorem 1 asserts the asymptotic optimality of $T_h$.
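
The sense in which a bounded gap implies asymptotic optimality can be seen from the leading-order terms of (22) and (23). The sketch below drops the o(1) terms and uses N = 5 and a few values of the false alarm level purely as illustrative assumptions; the function name is hypothetical.

```python
import math

def delay_bounds(gamma, n_sensors):
    """Leading-order lower and upper bounds on the optimal detection delay,
    with the o(1) terms of (22) and (23) dropped."""
    lower = math.log(gamma) - 1.0                          # from (23)
    upper = math.log(gamma) + math.log(n_sensors) - 1.0    # from (22)
    return lower, upper

ratios = []
for gamma in (1e2, 1e4, 1e8):
    lower, upper = delay_bounds(gamma, n_sensors=5)
    ratios.append(upper / lower)
    print(gamma, upper - lower, upper / lower)
```

The gap stays equal to log N while both bounds grow like the logarithm of the false alarm level, so the ratio of the two bounds decreases towards 1.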

III. CONCLUSIONS AND FUTURE WORK

In this paper we have demonstrated the asymptotic op- 
timality of the minimum of N CUSUMs for detecting the 
minimum of N different change points in a coupled system 
of N sensors which receive sequential observations from 
the environment. We have allowed for a general dependence 
structure in the observations and we have shown that the 
N-CUSUM stopping rule is asymptotically optimal, as the
mean time to the first false alarm increases without bound, 
in detecting the minimum of N different change-points 
in the sense that it minimizes a worst average Kullback- 
Leibler divergence criterion. This has been seen by the fact 
that the difference in detection delay of the proposed N- 
CUSUM stopping rule and the unknown optimal stopping 
rule is bounded above by the constant log N. An interesting 
extension of this work would incorporate the fact that the 



distributions of the signals received in different sensors may 
be different. In this case the fact that the optimal stopping 
rule has to be an equalizer rule (i.e., satisfy (8)) would
determine the optimal selection of thresholds in each sensor 
which in the general case should be different. 

V. APPENDIX

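As an illustration for the general case, let us prove the
result for N = 2.

We begin by deriving the partial differential equations
satisfied by the functions

$\tilde{T}(x,y) = E_{\infty,\infty}^{(x,y)}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt \Big\}$ and $\tilde{S}(x,y) = E_{0,\infty}^{(x,y)}\Big\{ \frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt \Big\},$

where the index $(x, y)$ indicates the initial value of the pair
of CUSUM processes $(y_t^{(1)}, y_t^{(2)})$. With this representation,
it is easy to see that $E_{0,\infty}\{\frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt\} = \tilde{S}(0,0)$ and
$E_{\infty,\infty}\{\frac{1}{2}\int_0^{T_h} (\alpha_t^{(1)})^2\,dt\} = \tilde{T}(0,0)$. In the sequel we
will denote by $\tilde{T}_x$, $\tilde{T}_y$, $\tilde{S}_x$, $\tilde{S}_y$ the first partial derivatives of
$\tilde{T}$ and $\tilde{S}$ with respect to x and y respectively. Similarly, we
will denote by $\tilde{T}_{xx}$, $\tilde{T}_{yy}$, $\tilde{S}_{xx}$, $\tilde{S}_{yy}$ the second partials.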
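Using Ito's rule [14], we have

(24)   $\begin{aligned} \tilde{T}(y_t^{(1)}, y_t^{(2)}) - \tilde{T}(x,y) = {} & \int_0^t \alpha_s^{(1)} \tilde{T}_x\,dw_s^{(1)} + \int_0^t \alpha_s^{(2)} \tilde{T}_y\,dw_s^{(2)} \\ & - \int_0^t \tilde{T}_x\,dm_s^{(1)} - \int_0^t \tilde{T}_y\,dm_s^{(2)} \\ & + \int_0^t \frac{1}{2} (\alpha_s^{(1)})^2 \big( \tilde{T}_{xx} + \tilde{T}_{yy} - \tilde{T}_x - \tilde{T}_y \big)\,ds, \end{aligned}$

where the arguments of each of the above functions are
$(y_s^{(1)}, y_s^{(2)})$ when omitted and where in the last line we use
the fact that $\alpha_t^{(i)}$ are of the same form for all i. Evaluating
the above equation at $T_h$ and taking expectations under the
$P_{\infty,\infty}$ measure, while using conditions (5), (14), we obtain
that $\tilde{T}$ has to satisfy

(25)   $\tilde{T}_{xx} + \tilde{T}_{yy} - \tilde{T}_x - \tilde{T}_y = -1, \quad (x,y) \in \mathcal{D} = [0,h]^2,$

with the Dirichlet boundary conditions

(26)   $\tilde{T}(x,y)\big|_{x=h} = \tilde{T}(x,y)\big|_{y=h} = 0$

and the Neumann boundary conditions

(27)   $\frac{\partial \tilde{T}}{\partial x}\Big|_{x=0} = \frac{\partial \tilde{T}}{\partial y}\Big|_{y=0} = 0.$
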



Notice that the Neumann boundary conditions ensure that the
terms in the second line of (24) vanish. Similarly, $\tilde{S}$ satisfies

(28)   $\tilde{S}_{xx} + \tilde{S}_{yy} + \tilde{S}_x - \tilde{S}_y = -1, \quad (x,y) \in \mathcal{D} = [0,h]^2,$

with the same boundary conditions as $\tilde{T}$.

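We can now introduce a change of variable $\tilde{x} = x/h$ and
$\tilde{y} = y/h$. By setting $\epsilon = 1/h$, we can rewrite (25) as

(29)   $\epsilon^2 \tilde{T}_{\tilde{x}\tilde{x}} + \epsilon^2 \tilde{T}_{\tilde{y}\tilde{y}} - \epsilon \tilde{T}_{\tilde{x}} - \epsilon \tilde{T}_{\tilde{y}} = -1, \quad (\tilde{x},\tilde{y}) \in \mathcal{D} = [0,1]^2,$

with the Dirichlet boundary conditions

(30)   $\tilde{T}(\tilde{x},\tilde{y})\big|_{\tilde{x}=1} = \tilde{T}(\tilde{x},\tilde{y})\big|_{\tilde{y}=1} = 0$

and the Neumann boundary conditions (27). By letting $\epsilon \tilde{T} = T$
and writing $(x, y)$ for the rescaled variables, we now obtain

(31)   $\epsilon T_{xx} + \epsilon T_{yy} - T_x - T_y = -1, \quad (x,y) \in \mathcal{D} = [0,1]^2,$

with T satisfying the Dirichlet boundary conditions of (30)
and the Neumann condition of (27). We are interested in the
asymptotics of $T(0, 0)$ for small values of $\epsilon$ (or equivalently
large values of h). $T(0, 0)$ can be interpreted as the mean
exit time of a particle that is placed initially at the origin,
with reflecting boundaries along the axes and absorbing
boundaries on the top and the right side of the rectangular
domain $\mathcal{D}$. In order to solve the above problem, we note
that we can write the solution T as

(32)   $T(x,y) = \int_0^\infty G(x, y, t)\,dt,$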



where G denotes the probability that the particle, initially
placed at a point $(x, y)$ in $\mathcal{D}$, leaves the domain $\mathcal{D}$ at a time
later than t. The evolution of G is then governed by the backward
Fokker-Planck equation:

(33)   $\frac{\partial G}{\partial t} = \epsilon \Delta G - \frac{\partial G}{\partial x} - \frac{\partial G}{\partial y}.$

Boundary conditions for G correspond to boundary conditions
of T and the initial condition of G is given by the fact
that, at t = 0, G has the value 1 in $\mathcal{D}$.

In the case of the particular geometry under consideration,
we can find an approximate solution to (33) and use this to
find T. This is due to the fact that, for a rectangular domain
under the assumptions given, the solution of (33) can be
found by simple separation of variables. Hence we find G as
a product of the form

(34)   $G(x,y,t) = G_1(x,t)\,G_2(y,t),$

where $G_1$ satisfies the equation

(35)   $\frac{\partial G_1}{\partial t} = \epsilon \frac{\partial^2 G_1}{\partial x^2} - \frac{\partial G_1}{\partial x}$

on [0, 1] with reflecting boundary at 0 and absorbing boundary
at 1. The same holds for $G_2$ with respect to the variable y.

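In order to solve (35), we apply a Laplace transform in
t and obtain for $\hat{G}_1 = \hat{G}_1(x,s)$ the ordinary differential
equation

(36)   $s\hat{G}_1 - 1 = \epsilon \hat{G}_1'' - \hat{G}_1'.$

Making use of the fact that $\epsilon$ is small, we find as leading
order approximation to the solution of (36)

(37)   $\hat{G}_1(0,s) \approx \frac{1}{s + \frac{1}{\epsilon} e^{-1/\epsilon}}.$

For this approximation it is simple to find the inverse Laplace
transform to obtain

(38)   $G_1(0,t) \approx \exp\Big\{ -\frac{1}{\epsilon} e^{-1/\epsilon}\, t \Big\}.$

Using this formula for both $G_1(0, t)$ and $G_2(0, t)$ we obtain
immediately for $T(0, 0)$ in (32) the asymptotic formula

(39)   $T(0,0) \approx \int_0^\infty \exp\Big\{ -\frac{2}{\epsilon} e^{-1/\epsilon}\, t \Big\}\,dt,$

from which it follows that $T(0,0) \approx \frac{\epsilon}{2} e^{1/\epsilon}$. Setting
$\tilde{T}(0, 0) = \gamma$, and using $h = 1/\epsilon$, we further obtain that as
$\gamma \to \infty$, $h \approx \log\gamma + \log 2$.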

For the asymptotic formula of $\tilde{S}(0, 0)$ of (28), we also
let $S = \epsilon\tilde{S}$ and use the same change of variable as in the
previous case. The only difference is that we have to solve
for $G_1$ the different problem

(40)   $s\hat{G}_1 - 1 = \epsilon \hat{G}_1'' + \hat{G}_1'.$

In this case, the approximate solution takes the form

(41)   $\hat{G}_1(0,s) \approx \frac{1}{s}\big(1 - e^{-s}\big) - \epsilon\, e^{-s} (1 + s).$

From here we obtain after inverse Laplace transform

(42)   $G_1(0,t) \approx H(1 - t) - \epsilon\big( \delta(t - 1) + \delta'(t - 1) \big),$

where H denotes the Heaviside function and $\delta$ denotes the
Dirac delta distribution. Combining the formulas (42) for $G_1$
and (38) for $G_2$ we find as approximation of $S(0, 0)$ for the
problem

(43)   $S(0,0) = \int_0^\infty G_1(0,t)\,G_2(0,t)\,dt \approx 1 - \epsilon,$

from which we obtain $\tilde{S}(0, 0) \approx \frac{1}{\epsilon} - 1 = h - 1$, from which
it follows that $\tilde{S}(0, 0) \approx \log\gamma + \log 2 - 1$ as $\gamma \to \infty$.
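
The leading-order approximations used above can be checked against the single-coordinate problems, which admit closed-form solutions. The sketch below is an independent check and not part of the original derivation: it uses the solutions at the origin of $\epsilon T'' - T' = -1$ and $\epsilon S'' + S' = -1$ on [0, 1], each with a reflecting condition at 0 and an absorbing condition at 1. The function names and the chosen values of `eps` are illustrative assumptions.

```python
import math

def exit_time_T(eps):
    """T(0) for eps*T'' - T' = -1 with T'(0) = 0 and T(1) = 0.
    Closed form: T(x) = x - eps*exp(x/eps) + eps*exp(1/eps) - 1."""
    return eps * math.exp(1.0 / eps) - eps - 1.0

def exit_time_S(eps):
    """S(0) for eps*S'' + S' = -1 with S'(0) = 0 and S(1) = 0.
    Closed form: S(x) = -x - eps*exp(-x/eps) + 1 + eps*exp(-1/eps)."""
    return 1.0 - eps + eps * math.exp(-1.0 / eps)

results = []
for eps in (0.2, 0.1, 0.05):
    ratio_T = exit_time_T(eps) / (eps * math.exp(1.0 / eps))  # tends to 1
    gap_S = exit_time_S(eps) - (1.0 - eps)                    # tends to 0
    results.append((eps, ratio_T, gap_S))
    print(eps, ratio_T, gap_S)
```

In the original scaling, `exit_time_T(eps) / eps` equals $e^h - h - 1$ and `exit_time_S(eps) / eps` equals $e^{-h} + h - 1$ with $h = 1/\epsilon$, which agrees with the single-process expressions $f(\nu)$ and $f(-\nu)$.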

Using the same derivational steps it is possible to generalize
to N sensors. In particular, in this case the integrand
for $T(x_1,\dots,x_N)$ in (32) becomes the product (see (34))
of N functions, $G_1(x_1, t),\dots,G_N(x_N, t)$, each of which
satisfies equation (35) with the same boundary conditions
with respect to their respective variables. Their respective
Laplace transforms satisfy (36). This leads to

(44)   $T(0,\dots,0) \approx \frac{\epsilon}{N} e^{1/\epsilon}.$

Similarly, $S(0,\dots,0)$ takes the form (43), with integrand
consisting of the product of N functions, the Laplace transform
of the first of which satisfies (40) and the Laplace
transforms of the others satisfy (36). Following the same
steps as before, this leads to the asymptotic formula

(45)   $S(0,\dots,0) \approx 1 - \epsilon.$

Using (44) and (45), we derive $\tilde{S}(0,\dots,0) \approx \log\gamma + \log N - 1$
as $\gamma \to \infty$. This completes the proof of Lemma 1.



